Creators/Authors contains: "Appling, Alison"


  1. Free, publicly-accessible full text available August 1, 2024
  2. Process-based modelling offers interpretability and physical consistency in many domains of geosciences but struggles to leverage large datasets efficiently. Machine-learning methods, especially deep networks, have strong predictive skills yet are unable to answer specific scientific questions. In this Perspective, we explore differentiable modelling as a pathway to dissolve the perceived barrier between process-based modelling and machine learning in the geosciences and demonstrate its potential with examples from hydrological modelling. ‘Differentiable’ refers to accurately and efficiently calculating gradients with respect to model variables or parameters, enabling the discovery of high-dimensional unknown relationships. Differentiable modelling involves connecting (flexible amounts of) prior physical knowledge to neural networks, pushing the boundary of physics-informed machine learning. It offers better interpretability, generalizability, and extrapolation capabilities than purely data-driven machine learning, achieving a similar level of accuracy while requiring less training data. Additionally, the performance and efficiency of differentiable models scale well with increasing data volumes. Under data-scarce scenarios, differentiable models have outperformed machine-learning models in producing short-term dynamics and decadal-scale trends owing to the imposed physical constraints. Differentiable modelling approaches are primed to enable geoscientists to ask questions, test hypotheses, and discover unrecognized physical relationships. Future work should address computational challenges, reduce uncertainty, and verify the physical significance of outputs. 
    Free, publicly-accessible full text available July 11, 2024
  3. Basin-centric long short-term memory (LSTM) network models have recently been shown to be an exceptionally powerful tool for stream temperature (Ts) temporal prediction (training in one period and making predictions for another period at the same sites). However, spatial extrapolation is a well-known challenge to modeling Ts and it is uncertain how an LSTM-based daily Ts model will perform in unmonitored or dammed basins. Here we compiled a new benchmark dataset consisting of >400 basins across the contiguous United States in different data availability groups (DAG, meaning the daily sampling frequency) with or without major dams and studied how to assemble suitable training datasets for predictions in basins with or without temperature monitoring. For prediction in unmonitored basins (PUB), LSTM produced an RMSE of 1.129 °C and R² of 0.983. While these metrics declined from LSTM's temporal prediction performance, they far surpassed traditional models' PUB values, and were competitive with traditional models' temporal prediction on calibrated sites. Even for unmonitored basins with major reservoirs, we obtained a median RMSE of 1.202 °C and an R² of 0.984. For temporal prediction, the most suitable training set was the matching DAG that the basin could be grouped into, e.g., the 60% DAG for a basin with 61% data availability. However, for PUB, a training dataset including all basins with data is consistently preferred. An input-selection ensemble moderately mitigated attribute overfitting. Our results indicate there are influential latent processes not sufficiently described by the inputs (e.g., geology, wetland covers), but temporal fluctuations are well predictable, and LSTM appears to be a highly accurate Ts modeling tool even for spatial extrapolation.
  4. This dataset includes model configurations, scripts and outputs to process and recreate the outputs from Ladwig et al. (2021): Long-term Change in Metabolism Phenology across North-Temperate Lakes. The provided scripts will process the input data from various sources, as well as recreate the figures from the manuscript. Further, all output data from the metabolism models of Allequash, Big Muskellunge, Crystal, Fish, Mendota, Monona, Sparkling and Trout are included. 
  5. This dataset includes model configurations, scripts and outputs to process and recreate the outputs from Ladwig et al. (2021): Long-term Change in Metabolism Phenology across North-Temperate Lakes. The provided scripts will process the input data from various sources, as well as recreate the figures from the manuscript. Further, all output data from the metabolism models of Allequash, Big Muskellunge, Crystal, Fish, Mendota, Monona, Sparkling and Trout are included. 
  6. Stream water temperature (Ts) is a variable of critical importance for aquatic ecosystem health. Ts is strongly affected by groundwater-surface water interactions which can be learned from streamflow records, but previously such information was challenging to effectively absorb with process-based models due to parameter equifinality. Based on the long short-term memory (LSTM) deep learning architecture, we developed a basin-centric lumped daily mean Ts model, which was trained over 118 data-rich basins with no major dams in the conterminous United States, and showed strong results. At a national scale, we obtained a median root-mean-square error (RMSE) of 0.69 °C, Nash-Sutcliffe model efficiency coefficient (NSE) of 0.985, and correlation of 0.994, which are marked improvements over previous values reported in literature. The addition of streamflow observations as a model input strongly elevated the performance of this model. In the absence of measured streamflow, we showed that a two-stage model can be used where simulated streamflow from a pre-trained LSTM model (Qsim) still benefits the Ts model, even though no new information was brought directly in the inputs of the Ts model; the model indirectly used information learned from streamflow observations provided during the training of Qsim, potentially to improve internal representation of physically meaningful variables. Our results indicate that strong relationships exist between basin-averaged forcing variables, catchment attributes, and Ts that can be simulated by a single model trained by data on the continental scale.
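
The stream temperature abstracts above report skill as RMSE, Nash-Sutcliffe efficiency (NSE), and correlation. As a reading aid, the following is a minimal sketch of the standard textbook definitions of those three metrics in plain Python. The function names and the sample temperature values are hypothetical, chosen only for illustration; they are not taken from the listed papers, and the papers' own evaluation code may differ (for example in how missing days are handled or how per-basin values are aggregated into a median).

```python
import math


def rmse(obs, sim):
    """Root-mean-square error between observed and simulated series."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance.

    1.0 is a perfect fit; 0.0 means the model is no better than always
    predicting the mean of the observations.
    """
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((s - o) ** 2 for o, s in zip(obs, sim))
    obs_spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_spread


def pearson_r(obs, sim):
    """Pearson correlation coefficient between the two series."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    spread_obs = sum((o - mean_obs) ** 2 for o in obs)
    spread_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov / math.sqrt(spread_obs * spread_sim)


# Hypothetical daily stream temperatures in degrees Celsius (illustration only).
observed = [10.0, 12.0, 14.0, 16.0]
simulated = [10.5, 11.5, 14.5, 15.5]

print("RMSE:", rmse(observed, simulated))
print("NSE:", nse(observed, simulated))
print("r:", pearson_r(observed, simulated))
```

RMSE is in the units of the data (°C here), while NSE and correlation are unitless, which is why the abstracts can quote NSE and correlation values close to 1 alongside an RMSE below 1 °C.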